The Wolfram Language provides several data structures for representing chemical species at different levels of granularity.
Start with a small BioSequence representing a single codon.
In[42]:=
bioseq=BioSequence["RNA","AUG"]
Out[42]=
BioSequence
From this bio sequence you can create a ChemicalFormula or a Molecule, depending on your application.
In[54]:=
form=ChemicalFormula@bioseq
Out[54]=
In[53]:=
mol=Molecule@bioseq
Out[53]=
Molecule
Equivalence between different representations can be checked easily using MoleculeMatchQ.
In[57]:=
{MoleculeMatchQ[mol,bioseq],MoleculeMatchQ[mol,form]}
Out[57]=
{True,True}
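As a contrast to the matching case, the same check can be pointed at a different codon. This is only a sketch: the second sequence "AUC" and the name other are illustrative choices that do not appear in the original example, and no output is shown because it has not been evaluated here.

```wolfram
(* Rebuild the original codon so the block stands on its own *)
bioseq = BioSequence["RNA", "AUG"];

(* Hypothetical second codon, chosen only for contrast *)
other = BioSequence["RNA", "AUC"];

(* Compare the explicit molecule of one codon against the other sequence *)
MoleculeMatchQ[Molecule[other], bioseq]
```

Because the last base differs, the two structures are not the same molecule, so this check would be expected to behave differently from the self-comparison.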
As the simplest representation, the formula allows you to find molecular mass and elemental composition.
In[62]:=
form[{"MolecularMass","ElementCounts"}]
Out[62]=
{[molecular mass], <|"C"->29,"H"->36,"N"->12,"O"->19,"P"->2|>}
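Because the formula and the molecule describe the same species, the molecular mass can be requested from either one. The following is a minimal sketch that assumes MoleculeValue accepts the "MolecularMass" property for the converted molecule; the two values would be expected to agree, but the output is not reproduced here.

```wolfram
(* Rebuild both representations from the same codon *)
bioseq = BioSequence["RNA", "AUG"];
form = ChemicalFormula[bioseq];
mol = Molecule[bioseq];

(* Ask each representation for its molecular mass and list them side by side *)
{form["MolecularMass"], MoleculeValue[mol, "MolecularMass"]}
```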
The molecule represents all atoms and bonds explicitly and allows computing topological properties or even generating a 3D structure.
In[63]:=
mol[{"AromaticRingCount","HBondDonorCount"}]
Out[63]=
{5,11}
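To see what "all atoms and bonds explicitly" means in practice, the atoms and bonds can be counted and listed directly. This sketch uses AtomCount, BondCount, AtomList and BondList; truncating to the first five entries with Take is an arbitrary choice made here to keep the output short.

```wolfram
(* Build the molecule from the single-codon sequence *)
mol = Molecule[BioSequence["RNA", "AUG"]];

(* Total numbers of atoms and bonds in the explicit representation *)
{AtomCount[mol], BondCount[mol]}

(* First few atoms and bonds, to inspect how they are stored *)
Take[AtomList[mol], 5]
Take[BondList[mol], 5]
```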
In[64]:=
MoleculePlot3D[mol,PlotTheme->"Spacefilling"]
Out[64]=
The bio sequence representation allows computation at a higher level of abstraction. Transcribe this RNA sequence into DNA or translate it into a peptide.
In[66]:=
BioSequenceTranscribe[bioseq]
Out[66]=
BioSequence
In[65]:=
BioSequenceTranslate[bioseq]
Out[65]=
BioSequence
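The two conversions can also be chained, going from DNA to RNA to peptide. The sketch below starts from a hypothetical six-base DNA sequence ("ATGGCC" is an illustrative choice, not part of the original example) and assumes BioSequenceTranscribe turns a DNA sequence into RNA in the same way it turns RNA into DNA above.

```wolfram
(* Hypothetical two-codon DNA sequence used only for illustration *)
dna = BioSequence["DNA", "ATGGCC"];

(* Transcribe DNA to RNA, then translate the RNA into a peptide *)
rna = BioSequenceTranscribe[dna];
peptide = BioSequenceTranslate[rna]
```

Working at the sequence level keeps these steps independent of the atom-level structure, which is only built when a Molecule is requested.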